We introduce a new tool for stochastic convex optimization (SCO): a Reweighted Stochastic Query (ReSQue) estimator for the gradient of a function convolved with a (Gaussian) probability density. Combining ReSQue with recent advances in ball oracle acceleration [CJJJLST20, ACJJS21], we develop algorithms achieving state-of-the-art complexities for SCO in parallel and private settings. For an SCO objective constrained to the unit ball in $\mathbb{R}^d$, we obtain the following results (up to polylogarithmic factors). We give a parallel algorithm obtaining optimization error $\epsilon_{\text{opt}}$ with $d^{1/3}\epsilon_{\text{opt}}^{-2/3}$ gradient oracle query depth and $d^{1/3}\epsilon_{\text{opt}}^{-2/3} + \epsilon_{\text{opt}}^{-2}$ gradient queries in total, assuming access to a bounded-variance stochastic gradient estimator. For $\epsilon_{\text{opt}} \in [d^{-1}, d^{-1/4}]$, our algorithm matches the state-of-the-art oracle depth of [BJLLS19] while maintaining the optimal total work of stochastic gradient descent. We give an $(\epsilon_{\text{dp}}, \delta)$-differentially private algorithm which, given $n$ samples of Lipschitz loss functions, obtains near-optimal optimization error and makes $\min(n, n^2\epsilon_{\text{dp}}^2 d^{-1}) + \min(n^{4/3}\epsilon_{\text{dp}}^{1/3}, (nd)^{2/3}\epsilon_{\text{dp}}^{-1})$ queries to the gradients of these functions. In the regime $d \le n \epsilon_{\text{dp}}^{2}$, where privacy comes at no cost in terms of the optimal loss up to constants, our algorithm uses $n + (nd)^{2/3}\epsilon_{\text{dp}}^{-1}$ queries and improves upon recent advances of [KLL21, AFKT21]. In the moderately low-dimensional setting $d \le \sqrt n \epsilon_{\text{dp}}^{3/2}$, our query complexity is near-linear.
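To make the estimator concrete: since $\nabla (f * \gamma_\rho)(x) = \mathbb{E}_{z \sim \mathcal{N}(x, \rho^2 I)}[\nabla f(z)]$ for the Gaussian density $\gamma_\rho$, gradients of the smoothed objective at many points $x$ near a shared center $\bar{x}$ can be estimated from samples drawn around $\bar{x}$ alone, by importance-reweighting with the Gaussian density ratio. The sketch below is our illustration of this reweighting idea under those assumptions, not the paper's pseudocode; all names are ours.

```python
import numpy as np

def resque_grad(grad_est, x, x_bar, rho, num_samples=64, rng=None):
    """Hedged sketch of a reweighted stochastic query (ReSQue) estimator.

    Estimates the gradient of the Gaussian smoothing
    f_rho(x) = E[f(x + rho * Z)] by sampling around a fixed center x_bar
    and importance-reweighting:
        E_{z ~ N(x_bar, rho^2 I)}[(p_x(z) / p_xbar(z)) * grad_est(z)]
          = E_{z ~ N(x, rho^2 I)}[grad_est(z)] = grad f_rho(x),
    where grad_est(z) is any unbiased bounded-variance stochastic gradient.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = x.shape[0]
    total = np.zeros(d)
    for _ in range(num_samples):
        z = x_bar + rho * rng.standard_normal(d)
        # Density ratio N(x, rho^2 I)(z) / N(x_bar, rho^2 I)(z).
        log_w = (np.sum((z - x_bar) ** 2) - np.sum((z - x) ** 2)) / (2 * rho**2)
        total += np.exp(log_w) * grad_est(z)
    return total / num_samples
```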
We propose the Detailed Outline Control (DOC) framework for improving long-range plot coherence when automatically generating several-thousand-word-long stories. DOC consists of two complementary components: a detailed outliner and a detailed controller. The detailed outliner creates a more detailed, hierarchically structured outline, shifting creative burden from the main drafting procedure to the planning stage. The detailed controller ensures the more detailed outline is still respected during generation by controlling story passages to align with outline details. In human evaluations of automatically generated stories, DOC substantially outperforms a strong Re3 baseline (Yang et al., 2022) on plot coherence (22.5% absolute gain), outline relevance (28.2%), and interestingness (20.7%). Humans also judged DOC to be much more controllable in an interactive generation setting.
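As a rough sketch of how the two components could fit together (our own placeholder interface, not DOC's actual implementation): the detailed outliner plans first, and the detailed controller then scores candidate passages for agreement with each outline item.

```python
def generate_story(premise, outliner, drafter, controller):
    """Hedged sketch of detailed-outline-then-controlled-drafting.

    `outliner`, `drafter`, and `controller` are placeholder callables
    standing in for DOC's components; this illustrates the described
    architecture, not the paper's code.
    """
    # Detailed outliner: expand the premise into a hierarchical outline
    # and draft against its leaves, shifting creative burden from the
    # drafting procedure to the planning stage.
    leaves = outliner(premise)  # e.g., flattened list of fine-grained plot points

    story = []
    for plot_point in leaves:
        # Detailed controller: among candidate continuations, keep the
        # one best aligned with the current outline detail.
        candidates = drafter(story, plot_point, n_candidates=8)
        story.append(max(candidates, key=lambda c: controller(c, plot_point)))
    return "".join(story)
```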
We present a machine-learning framework to accurately characterize morphologies of Active Galactic Nucleus (AGN) host galaxies at $z<1$. We first use PSFGAN to decouple host galaxy light from the central point source, then we invoke the Galaxy Morphology Network (GaMorNet) to estimate whether the host galaxy is disk-dominated, bulge-dominated, or indeterminate. Using optical images from five bands of the HSC Wide Survey, we build models independently in three redshift bins: low $(0<z<0.25)$, medium $(0.25<z<0.5)$, and high $(0.5<z<1.0)$. By first training on a large number of simulated galaxies, then fine-tuning using far fewer classified real galaxies, our framework predicts the actual morphology for $\sim 60\%-70\%$ of host galaxies in the test sets, with a classification precision of $\sim 80\%-95\%$, depending on the redshift bin. Specifically, our models achieve disk precisions of $96\%/82\%/79\%$ and bulge precisions of $90\%/90\%/80\%$ (for the three redshift bins), at thresholds corresponding to indeterminate fractions of $30\%/43\%/42\%$. The classification precision of our models depends noticeably on host galaxy radius and magnitude; no strong dependence on contrast ratio is observed. When classifying real AGN hosts, our models agree well with traditional 2D fitting with GALFIT. The PSFGAN+GaMorNet framework does not depend on the choice of fitting functions or galaxy-related input parameters, runs orders of magnitude faster than GALFIT, and is easily generalizable via transfer learning, making it an ideal tool for studying AGN host galaxy morphology in forthcoming large imaging surveys.
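Schematically, the pipeline chains the two networks per redshift bin. The wrapper below is our sketch with hypothetical method names (PSFGAN and GaMorNet are real tools, but these signatures are placeholders, not their actual APIs).

```python
def classify_agn_host(image, psfgan_models, gamornet_models, z_bin):
    """Hedged sketch of the two-stage framework; method names are
    hypothetical placeholders, not the tools' real interfaces."""
    # Stage 1: PSFGAN decouples host-galaxy light from the central
    # point source (the AGN).
    host_light = psfgan_models[z_bin].remove_point_source(image)

    # Stage 2: GaMorNet (trained on simulations, fine-tuned on a small
    # set of classified real galaxies) scores the residual host light.
    p_disk, p_indet, p_bulge = gamornet_models[z_bin].predict(host_light)
    scores = {"disk-dominated": p_disk,
              "indeterminate": p_indet,
              "bulge-dominated": p_bulge}
    return max(scores, key=scores.get)
```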
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about common practice as well as the bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical image analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a sizable portion of participants (32%) stated that they did not have enough time for method development, and 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based; of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
We propose a new framework for differentially private optimization of convex functions which are Lipschitz in an arbitrary norm $\|\cdot\|_X$. Our algorithms are based on a regularized exponential mechanism which samples from the density $\propto \exp(-k(F + \mu r))$, where $F$ is the empirical loss and $r$ is a regularizer which is strongly convex with respect to $\|\cdot\|_X$, generalizing recent work of [GLL22] to non-Euclidean settings. We show that this mechanism satisfies Gaussian differential privacy, and that it solves both DP-ERM (empirical risk minimization) and DP-SCO (stochastic convex optimization) by using localization tools from convex geometry. Our framework is the first to apply to private convex optimization in general normed spaces, and it directly recovers non-private SCO rates achieved by mirror descent as the privacy parameter $\epsilon \to \infty$. As applications, for Lipschitz optimization in $\ell_p$ norms for all $p \in (1, 2)$, we obtain the first optimal privacy-utility tradeoffs; for $p = 1$, we improve tradeoffs obtained by the recent works [AFKT21, BGN21] by at least a logarithmic factor. Our $\ell_p$ norm and Schatten-$p$ norm optimization frameworks are complemented by polynomial-time samplers whose query complexity we explicitly bound.
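For intuition only, here is a minimal sketch, specialized to the Euclidean regularizer $r(x) = \frac{1}{2}\|x\|_2^2$, of drawing an approximate sample from a density $\propto \exp(-k(F + \mu r))$ via unadjusted Langevin dynamics. This is a generic illustration of sampling from a regularized exponential mechanism; the paper's samplers and their polynomial-time guarantees are substantially more careful than this.

```python
import numpy as np

def sample_reg_exp_mechanism(grad_F, d, k, mu, steps=5000, eta=1e-3, rng=None):
    """Hedged sketch: approximately sample x ~ exp(-k(F(x) + mu/2 ||x||^2))
    with unadjusted Langevin dynamics, specialized to the Euclidean
    regularizer. Not the paper's sampler.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.zeros(d)
    for _ in range(steps):
        # Gradient of the potential U(x) = k * (F(x) + (mu/2) ||x||^2).
        grad_U = k * (grad_F(x) + mu * x)
        # Langevin step: gradient descent plus Gaussian noise.
        x = x - eta * grad_U + np.sqrt(2 * eta) * rng.standard_normal(d)
    return x
```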
In contrast to single-objective optimization (SOO), multi-objective optimization (MOO) requires an optimizer to find the Pareto frontier: the subset of feasible solutions that are not dominated by other feasible solutions. In this paper we propose LaMOO, a novel multi-objective optimizer that learns a model from observed samples to partition the search space, and then focuses on promising regions that are likely to contain a subset of the Pareto frontier. The partitioning is based on the dominance number, which measures how close a data point is to the Pareto frontier among the existing samples. To account for possible partition errors due to limited samples and model mismatch, we leverage Monte Carlo Tree Search (MCTS) to exploit promising regions while exploring suboptimal regions that may turn out to contain good solutions later. Theoretically, we prove the efficacy of learning space partitions via LaMOO under certain assumptions. Empirically, measured by hypervolume (HV), a popular MOO metric, LaMOO substantially outperforms strong baselines on multiple real-world MOO tasks, improving sample efficiency by up to 225% for neural architecture search on NasBench and by up to 10% for molecular design.
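The dominance number itself is simple to compute. Below is our brute-force sketch, assuming all objectives are minimized; it illustrates the definition above, not LaMOO's implementation.

```python
import numpy as np

def dominance_numbers(Y):
    """For each sample, count how many other samples dominate it.

    Y: (n, m) array of objective values, all to be minimized. Point j
    dominates point i if it is no worse in every objective and strictly
    better in at least one. Pareto-frontier points get dominance number
    0; larger values mean "farther" from the frontier. O(n^2 m) brute
    force, for illustration.
    """
    n = Y.shape[0]
    counts = np.zeros(n, dtype=int)
    for i in range(n):
        no_worse = np.all(Y <= Y[i], axis=1)
        strictly_better = np.any(Y < Y[i], axis=1)
        counts[i] = int(np.sum(no_worse & strictly_better))
    return counts
```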
We study the problem of list-decodable mean estimation, where an adversary may corrupt a majority of the dataset. Specifically, we are given a set $T$ of $n$ points in $\mathbb{R}^d$ and a parameter $0 < \alpha < \frac{1}{2}$ such that an $\alpha$-fraction of the points in $T$ are i.i.d. samples from a well-behaved distribution $\mathcal{D}$ and the remaining $(1-\alpha)$-fraction are arbitrary. The goal is to output a small list of vectors, at least one of which is close to the mean of $\mathcal{D}$. We develop new algorithms for list-decodable mean estimation achieving nearly-optimal statistical guarantees, with running time $O(n^{1+\epsilon_0} d)$ for any fixed $\epsilon_0 > 0$. All prior algorithms for this problem incurred additional polynomial factors of $\frac{1}{\alpha}$. We leverage this result, together with additional techniques, to obtain the first almost-linear time algorithms for clustering mixtures of separated well-behaved distributions, nearly matching the statistical guarantees of spectral methods. Prior clustering algorithms inherently relied on applications of $k$-PCA, thereby incurring runtimes of $\Omega(ndk)$. This marks the first runtime improvement for this basic statistical problem in nearly two decades. The starting point of our approach is a novel and simpler near-linear time robust mean estimation algorithm in the $\alpha \to 1$ regime, based on a one-shot matrix multiplicative weights-inspired potential decrease. We crucially leverage this new algorithmic framework in the context of the iterative multi-filtering technique of Diakonikolas et al. '18, '20, providing a method to simultaneously cluster and downsample points using one-dimensional projections, thus bypassing the $k$-PCA subroutines required by prior algorithms.
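To make the input model concrete, here is a small simulation of the $\alpha$-corrupted setting and the list-decoding success criterion (ours, purely illustrative; the near-linear time estimation algorithm itself is the paper's contribution and is not reproduced here).

```python
import numpy as np

def corrupted_sample(n, d, alpha, true_mean, rng=None):
    """Generate a list-decodable mean estimation input: an alpha-fraction
    of i.i.d. inliers (here standard Gaussian around true_mean, a simple
    stand-in for a "well-behaved" distribution) and a (1-alpha)-fraction
    of adversarial points (here clustered at a decoy location).
    """
    rng = np.random.default_rng() if rng is None else rng
    n_in = int(alpha * n)
    inliers = true_mean + rng.standard_normal((n_in, d))
    outliers = 100.0 + rng.standard_normal((n - n_in, d))  # adversary's decoy
    return np.vstack([inliers, outliers])

def list_error(candidates, true_mean):
    """Success criterion: at least one candidate in the output list is
    close to the true mean, i.e., the minimum distance over the list."""
    return min(np.linalg.norm(c - true_mean) for c in candidates)
```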
The convergence rates of iterative methods for solving a linear system $\mathbf{A} x = b$ typically depend on the condition number of the matrix $\mathbf{A}$. Preconditioning is a common way of speeding up these methods by reducing that condition number in a computationally inexpensive way. In this paper, we revisit the decades-old problem of how to best improve $\mathbf{A}$'s condition number by left or right diagonal rescaling. We make progress on this problem in several directions. First, we provide new bounds for the classic heuristic of scaling $\mathbf{A}$ by its diagonal values (a.k.a. Jacobi preconditioning). We prove that this approach reduces $\mathbf{A}$'s condition number to within a quadratic factor of the best possible scaling. Second, we give a solver for structured mixed packing and covering semidefinite programs (MPC SDPs) which computes a near-optimal scaling for $\mathbf{A}$ in $\widetilde{O}(\text{nnz}(\mathbf{A}) \cdot \text{poly}(\kappa^\star))$ time; this matches the cost of solving the linear system after scaling, up to $\widetilde{O}(\text{poly}(\kappa^\star))$ factors. Third, we demonstrate that a sufficiently general width-independent MPC SDP solver would imply near-optimal runtimes for the scaling problems we consider, as well as for natural variants concerned with measures of average conditioning. Finally, we highlight connections between our preconditioning techniques and semi-random noise models, as well as applications in reducing risk in several statistical regression models.
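The Jacobi heuristic from the first result is easy to demonstrate: symmetrically rescale a symmetric positive definite $\mathbf{A}$ by its diagonal and compare condition numbers. A minimal sketch (ours; the near-optimal scalings come from the SDP solver, which is not shown):

```python
import numpy as np

def jacobi_precondition(A):
    """Symmetric Jacobi scaling of an SPD matrix: D^{-1/2} A D^{-1/2}
    with D = diag(A), so the rescaled matrix has unit diagonal. Per the
    result above, this brings the condition number to within a quadratic
    factor of the best diagonal scaling.
    """
    d_inv_sqrt = 1.0 / np.sqrt(np.diag(A))
    return A * np.outer(d_inv_sqrt, d_inv_sqrt)

# Example: an SPD matrix made ill-conditioned by bad per-coordinate scales.
rng = np.random.default_rng(0)
B = rng.standard_normal((50, 50))
S = np.diag(10.0 ** rng.uniform(-3, 3, size=50))
A = S @ (B @ B.T + 50 * np.eye(50)) @ S
print(np.linalg.cond(A), np.linalg.cond(jacobi_precondition(A)))
```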
Existing automated techniques for software documentation typically attempt to reason between two main sources of information: code and natural language. However, this reasoning process is often complicated by the lexical gap between more abstract natural language and more structured programming languages. One potential bridge for this gap is the Graphical User Interface (GUI), as GUIs inherently encode salient information about underlying program functionality into rich, pixel-based data representations. This paper offers one of the first comprehensive empirical investigations into the connection between GUIs and functional, natural language descriptions of software. First, we collect, analyze, and open-source a large dataset of functional GUI descriptions consisting of 45,998 descriptions for 10,204 screenshots from popular Android applications. The descriptions were obtained from human labelers and underwent several quality control mechanisms. To gain insight into the representational potential of GUIs, we investigate the ability of four Neural Image Captioning models to predict natural language descriptions of varying granularity when provided a screenshot as input. We evaluate these models quantitatively, using common machine translation metrics, and qualitatively through a large-scale user study. Finally, we offer lessons learned and a discussion of the potential shown by multimodal models to enhance future techniques for automated software documentation.
Deep learning models can achieve high accuracy when trained on large amounts of labeled data. However, real-world scenarios often involve several challenges: Training data may become available in installments, may originate from multiple different domains, and may not contain labels for training. Certain settings, for instance medical applications, often involve further restrictions that prohibit retention of previously seen data due to privacy regulations. In this work, to address such challenges, we study unsupervised segmentation in continual learning scenarios that involve domain shift. To that end, we introduce GarDA (Generative Appearance Replay for continual Domain Adaptation), a generative-replay based approach that can adapt a segmentation model sequentially to new domains with unlabeled data. In contrast to single-step unsupervised domain adaptation (UDA), continual adaptation to a sequence of domains enables leveraging and consolidation of information from multiple domains. Unlike previous approaches in incremental UDA, our method does not require access to previously seen data, making it applicable in many practical scenarios. We evaluate GarDA on two datasets with different organs and modalities, where it substantially outperforms existing techniques.
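A schematic sketch of such a generative-replay adaptation loop follows (our placeholder components and method names, not GarDA's actual training procedure).

```python
def continual_adapt(segmenter, generator, domains):
    """Hedged sketch of generative replay for continual domain adaptation:
    when adapting to each new unlabeled domain, previously seen data is
    unavailable (e.g., for privacy reasons), so a generator trained on
    earlier domains supplies replayed images instead. All components and
    training calls are placeholders for illustration.
    """
    for t, domain in enumerate(domains):
        if t == 0:
            segmenter.train_supervised(domain)  # labeled source domain
        else:
            # Stand-in for data from previously seen domains.
            replay = generator.sample(n=len(domain))
            # Adapt on new unlabeled images while consolidating old
            # knowledge via the replayed (generated) appearance data.
            segmenter.adapt_unsupervised(new=domain, replay=replay)
        # Update the generator so it also covers the newly seen domain.
        generator.fit(domain)
    return segmenter
```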